-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement "regex" expression that returns an array of match groups or… #6228
Conversation
… null if no match.
Thanks for getting this started, @mkv123! Some initial thoughts/questions:
cc @mapbox/gl |
Hi Anand and thanks for the input.
|
This is the case for ICU, which underlies the most obvious regular expression implementations that mbgl or the native SDKs would use. |
Updated the pull request to use "regex-match", "regex-test" and "regex-replace" for the three different uses. Added basic tests for them as well. As regex replace is a string method in javascript and in ICU, I'm not entirely sure what would be the best order for the "regex-replace" arguments. I do think the regular expression string should be the first argument on a regex operation. Current order in implementation is expression, flags (optional), replacement string, original string, which happens to match the order for python's re.sub() for example. I've looked at the regex standards and hunted for a nice comparison between them to establish what the differences in supported syntax are, but haven't really been able to find one. For reference: |
This table on Wikipedia is a good starting point. For the purposes of map styling, I think the most notable differences are the lack of (fixed-width) lookbehind, named capture groups, or Unicode character classes in JavaScript regular expressions. For example, JavaScript also has a hack in which /ref #4089 (comment) |
I think detecting and preventing the use of syntactic features that aren't supported by both ICU and JS is probably okay, but it's less clear to me what we can/should do about the differing behavior for things like In terms of API design, how about:
["replace",
"philadelphia, pa",
["regexp", "(\w+),\s*(\w+)"],
[ "concat", ["group", "0"], " ", ["upcase", ["group", "1"]] ]
] |
Of course, it's not clear exactly what each of those should do. There's a number of possible use cases, and different languages make differing choices about which are conveniently built in, or which names to use. I surveyed JS, C++
Other random notes: Note that a full-fidelity match result is For replace, I'm not fond of
The first is preferred, since it's a generally useful feature, and we try to avoid meta-syntaxes. |
For completeness, here’s a corresponding table for the various regular expression–related APIs in Objective-C and Swift: Regular expression APIs in Objective-C and Swift
|
I tried regex-test in a filter and it works very well, great work! Hope this will be added to a standard release asap. One thing I'm missing is the possibility to make the regex case insensitive with the 'i' flag in RegExp() |
I think we should hold off on first-class functions until we've seen one or two more concrete use cases where we'd want to use them. (I can hypothesize such use cases, but first class functions add enough complexity that I'd want to have some confidence that they're going to happen in practice.) Agreed on ideally avoiding a meta-syntax (though at least this one would be a pretty well-established one).
A variation that would address the former, but not the latter, issue would be to have |
I have tried regex-test a bit more as a filter, and it works well when used as the only filter, e.g.:
but when adding more than 1 filter, e.g:
it breaks with the following error: evented.js:121 Error: layers.id-b233f807-e131-4b3a-a15e-6d03a5346d40.filter[1][0]: expected one of [==, !=, >, >=, <, <=, in, !in, all, any, none, has, !has], "regex-test" found SOLUTION: Added
in filter_operators in src/style-spec/reference/v8.json |
Any recent movement on this? It would be a very valuable addition. |
I'm going to close this PR because it's stale. Let's continue the discussion in the original ticket #4089. |
Too bad, it's a perfectly working solution (with my above addition) which I have to manually merge in to a forked version every time a new version of Mapbox is released. |
@anderslarssonvbg please read this ticket, and #4089 (comment) for rationale. |
… null if no match.
Implements a "regex" expression (#4089 ) which takes two or three string parameters: First is the regular expression to use, second (optional) is flags and third is the source string. Also added some basic tests for the regex feature.
As this returns the match groups, it works for both checking for matches or search and replace type of functionality (#4100 ) by using let, array access and concat operations.
As on failure this returns null, it works nicely with coalesce, in case you need to specify multiple different cases of regexes.